841 research outputs found

    A Strategy For Identifying Putative Causes Of Gene Expression Variation In Human Cancer

    There is often a need to predict the impact of alterations in one variable on another variable. This is especially the case in cancer research, where much effort has been made to carry out large-scale gene expression screening by microarray techniques. However, the causes of expression variability from one cancer to another and from one gene to another often remain unknown. In this study we present a systematic procedure for finding genes whose expression is altered by an intrinsic or extrinsic explanatory phenomenon. The procedure has three stages: preprocessing, data integration and statistical analysis. We tested and verified the utility of this approach in a study in which the expression and copy number of 13,824 genes were determined in 14 breast cancer samples. The expression of 270 genes could be explained by the variability of gene copy number. These genes may represent an important set of primary, genetically "damaged" genes that drive cancer progression.

    Classification of unknown primary tumors with a data-driven method based on a large microarray reference database

    We present a new method to analyze cancer of unknown primary origin (CUP) samples. Our method achieves good classification accuracy (88% in leave-one-out cross-validation for primary tumors from 56 categories, 78% for CUP samples), and can also be used to study CUP samples on a gene-by-gene basis. Unlike many previous methods, it is not tied to any a priori defined gene set, and it is adaptable to newly emerging information.

    The impact of RNA sequence library construction protocols on transcriptomic profiling of leukemia

    Background: RNA sequencing (RNA-seq) has become an indispensable tool to identify disease-associated transcriptional profiles and determine the molecular underpinnings of diseases. However, the broad adaptation of the methodology into the clinic is still hampered by inconsistent results from different RNA-seq protocols and requires further evaluation of its analytical reliability using patient samples. Here, we applied two commonly used RNA-seq library preparation protocols to samples from acute leukemia patients to understand how poly-A-tailed mRNA selection (PA) and ribo-depletion (RD) based RNA-seq library preparation protocols affect gene fusion detection, variant calling, and gene expression profiling. Results: Overall, the protocols produced similar results with consistent outcomes. Nevertheless, the PA protocol was more efficient in quantifying expression of leukemia marker genes and showed better performance in the expression-based classification of leukemia. Independent qRT-PCR experiments verified that the PA protocol better represented total RNA compared to the RD protocol. In contrast, the RD protocol detected a higher number of non-coding RNA features and had better alignment efficiency. The RD protocol also recovered more known fusion-gene events, although variability was seen in fusion gene predictions. Conclusion: The overall findings provide a framework for the use of RNA-seq in a precision medicine setting with a limited number of samples and suggest that selection of the library preparation protocol should be based on the objectives of the analysis. Peer reviewed

    Germline EMSY sequence alterations in hereditary breast cancer and ovarian cancer families

    Background: BRCA1 and BRCA2 mutations explain approximately one-fifth of the inherited susceptibility in high-risk Finnish hereditary breast and ovarian cancer (HBOC) families. EMSY is located in the breast cancer-associated chromosomal region 11q13. The EMSY gene encodes a BRCA2-interacting protein that has been implicated in DNA damage repair and genomic instability. We analysed the role of germline EMSY variation in breast/ovarian cancer predisposition. The present study describes the first EMSY screening in patients with high familial risk for this disease. Methods: Index individuals from 71 high-risk, BRCA1/2-negative HBOC families were screened for germline EMSY sequence alterations in protein coding regions and exon-intron boundaries using Sanger sequencing and TaqMan assays. The identified variants were further screened in 36 Finnish HBOC patients and 904 controls. Moreover, one novel intronic deletion was screened in a cohort of 404 breast cancer patients unselected for family history. Haplotype block structure and the association of haplotypes with breast/ovarian cancer were analysed using Haploview. The functionality of the identified variants was predicted using Haploreg, RegulomeDB, Human Splicing Finder, and Pathogenic-or-Not-Pipeline 2. Results: Altogether, 12 germline EMSY variants were observed. Two alterations were located in the coding region, five alterations were intronic, and five alterations were located in the 3' untranslated region (UTR). Variant frequencies did not significantly differ between cases and controls. The novel variant, c.2709 + 122delT, was detected in 1 out of 107 (0.9%) breast cancer patients, and the carrier showed a bilateral form of the disease. The deletion was absent in 897 controls (OR = 25.28; P = 0.1) and in 404 breast cancer patients unselected for family history. No haplotype was identified to increase the risk of breast/ovarian cancer. Functional analyses suggested that variants, particularly in the 3'UTR, were located within regulatory elements. The novel deletion was predicted to affect splicing regulatory elements. Conclusions: These results suggest that the identified EMSY variants are likely neutral at the population level. However, these variants may contribute to breast/ovarian cancer risk in single families. Additional analyses are warranted for rare novel intronic deletions and the 3'UTR variants predicted to have functional roles.
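The abstract above reports OR = 25.28 for a variant seen in 1 of 107 cases and 0 of 897 controls. A plain odds ratio is undefined when one cell is zero, so some correction must have been applied. The abstract does not say which one. As a minimal sketch, assuming the common Haldane-Anscombe correction (adding 0.5 to every cell of the 2x2 table), the arithmetic happens to reproduce the reported figure. The function name below is hypothetical and is not taken from the paper.

```python
def odds_ratio_haldane(case_carriers, case_total, control_carriers, control_total):
    """Odds ratio with 0.5 added to each cell (Haldane-Anscombe correction).

    Assumption: this correction is used here only for illustration; the
    abstract does not state how its odds ratio was computed.
    """
    a = case_carriers + 0.5                        # carriers among cases
    b = (case_total - case_carriers) + 0.5         # non-carriers among cases
    c = control_carriers + 0.5                     # carriers among controls
    d = (control_total - control_carriers) + 0.5   # non-carriers among controls
    return (a * d) / (b * c)


# Counts taken from the abstract: 1 of 107 cases, 0 of 897 controls.
odds_ratio = odds_ratio_haldane(1, 107, 0, 897)
print(round(odds_ratio, 2))  # 25.28
```

With a single carrier, the estimate depends almost entirely on the correction term, which fits the abstract's non-significant P value.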

    FLT3-ITD allelic ratio and HLF expression predict FLT3 inhibitor efficacy in adult AML

    FLT3 internal tandem duplication (FLT3-ITD) is a frequent mutation in acute myeloid leukemia (AML) and remains a strong prognostic factor due to a high rate of disease recurrence. Several FLT3-targeted agents have been developed, but determinants of variable responses to these agents remain understudied. Here, we investigated the role of FLT3-ITD allelic ratio (ITD-AR), ITD length, and associated gene expression signatures in FLT3 inhibitor response in adult AML. We performed fragment analysis, ex vivo drug testing, and next generation sequencing (RNA, exome) on 119 samples from 87 AML patients and 13 healthy bone marrow controls. We found that ex vivo response to FLT3 inhibitors is significantly associated with ITD-AR, but not with ITD length. Interestingly, we found that the HLF gene is overexpressed in FLT3-ITD+ AML and associated with ITD-AR. The retrospective analysis of AML patients treated with the FLT3 inhibitor sorafenib showed that patients with high HLF expression and ITD-AR had a better clinical response to therapy compared to those with low ITD-AR and HLF expression. Thus, our findings suggest that FLT3 ITD-AR together with increased HLF expression plays a role in the variable FLT3 inhibitor responses observed in FLT3-ITD+ AML patients. Peer reviewed

    Alignment of gene expression profiles from test samples against a reference database: New method for context-specific interpretation of microarray data

    Kilpinen, Sami K; Ojala, Kalle A; Kallioniemi, Olli P. BioData Mining (BioData Min.), England. 2011 Mar 31;4:5. BACKGROUND: Gene expression microarray data have been organized and made available as public databases, but the utilization of such highly heterogeneous reference datasets in the interpretation of data from individual test samples is not as developed as e.g. in the field of nucleotide sequence comparisons. We have created a rapid and powerful approach for the alignment of microarray gene expression profiles (AGEP) from test samples with those contained in a large annotated public reference database and demonstrate here how this can facilitate interpretation of microarray data from individual samples. METHODS: AGEP is based on the calculation of kernel density distributions for the levels of expression of each gene in each reference tissue type and provides a quantitation of the similarity between the test sample and the reference tissue types as well as the identity of the typical and atypical genes in each comparison. As a reference database, we used 1654 samples from 44 normal tissues (extracted from the Genesapiens database). RESULTS: Using leave-one-out validation, AGEP correctly defined the tissue of origin for 1521 (93.6%) of all the 1654 samples in the original database. Independent validation of 195 external normal tissue samples resulted in 87% accuracy for the exact tissue type and 97% accuracy with related tissue types. AGEP analysis of 10 Duchenne muscular dystrophy (DMD) samples provided quantitative description of the key pathogenetic events, such as the extent of inflammation, in individual samples and pinpointed tissue-specific genes whose expression changed (SAMD4A) in DMD. AGEP analysis of microarray data from adipocytic differentiation of mesenchymal stem cells and from normal myeloid cell types and leukemias provided quantitative characterization of the transcriptomic changes during normal and abnormal cell differentiation. CONCLUSIONS: The AGEP method is a widely applicable method for the rapid comprehensive interpretation of microarray data, as proven here by the definition of tissue- and disease-specific changes in gene expression as well as during cellular differentiation. The capability to quantitatively compare data from individual samples against a large-scale annotated reference database represents a widely applicable paradigm for the analysis of all types of high-throughput data. AGEP enables systematic and quantitative comparison of gene expression data from test samples against a comprehensive collection of different cell/tissue types previously studied by the entire research community. Peer reviewed
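The abstract above describes per-gene kernel density distributions for each reference tissue, used to quantify how similar a test sample is to each tissue. The sketch below illustrates that general idea only and is not the published AGEP algorithm. The Gaussian kernel, the fixed bandwidth, the mean log-density score, the function names, and the toy expression values are all assumptions made for illustration.

```python
import math


def gaussian_kde_density(x, samples, bandwidth):
    """Gaussian kernel density estimate at x from 1-D reference values."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm


def tissue_scores(test_profile, reference, bandwidth=0.5):
    """Score each tissue by the mean log-density of the test sample's genes.

    Assumption: averaging log-densities is a stand-in similarity measure,
    not the scoring used by AGEP.
    """
    scores = {}
    for tissue, genes in reference.items():
        log_densities = []
        for gene, value in test_profile.items():
            density = gaussian_kde_density(value, genes[gene], bandwidth)
            # Small constant avoids log(0) for values far from every reference sample.
            log_densities.append(math.log(density + 1e-12))
        scores[tissue] = sum(log_densities) / len(log_densities)
    return scores


# Invented toy data: expression values per gene for two reference tissues.
reference = {
    "muscle": {"GENE_A": [8.0, 8.5, 9.0], "GENE_B": [2.0, 2.5, 1.5]},
    "liver": {"GENE_A": [3.0, 2.5, 3.5], "GENE_B": [7.0, 7.5, 6.5]},
}
test_profile = {"GENE_A": 8.4, "GENE_B": 2.1}

scores = tissue_scores(test_profile, reference)
best_tissue = max(scores, key=scores.get)
print(best_tissue)  # muscle
```

In the same spirit, genes with the lowest density under the best-matching tissue could be flagged as atypical for that sample, which loosely mirrors the typical/atypical gene distinction mentioned in the abstract.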